Hierarchical Density Order Embeddings
Authors
Ben Athiwaratkun, Andrew Gordon Wilson
Abstract
By representing words with probability densities rather than point vectors, probabilistic word embeddings can capture rich and interpretable semantic information and uncertainty (Vilnis & McCallum, 2014; Athiwaratkun & Wilson, 2017). The uncertainty information can be particularly meaningful in capturing entailment relationships – whereby general words such as “entity” correspond to broad distributions that encompass more specific words such as “animal” or “instrument”. We introduce density order embeddings, which learn hierarchical representations through encapsulation of probability distributions. In particular, we propose simple yet effective loss functions and distance metrics, as well as graph-based schemes to select negative samples to better learn hierarchical probabilistic representations. Our approach provides state-of-the-art performance on the WordNet hypernym relationship prediction task and the challenging HyperLex lexical entailment dataset – while retaining a rich and interpretable probabilistic representation.
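To make the notion of encapsulation concrete, one simple realization is an asymmetric divergence penalty between Gaussian densities combined with a max-margin loss over positive and negative pairs. The sketch below is an illustration under assumed details, not the authors' released implementation: it uses diagonal Gaussians, a thresholded KL divergence, and hypothetical values for the slack threshold `gamma` and the `margin`.

```python
import numpy as np

def kl_diag_gaussians(mu0, var0, mu1, var1):
    """KL( N(mu0, diag(var0)) || N(mu1, diag(var1)) ) for diagonal Gaussians."""
    return 0.5 * np.sum(var0 / var1 + (mu1 - mu0) ** 2 / var1
                        - 1.0 + np.log(var1 / var0))

def order_penalty(mu_spec, var_spec, mu_gen, var_gen, gamma=1.0):
    """Thresholded divergence penalty: near zero when the general word's
    density encloses the specific word's density, positive otherwise.
    `gamma` is a hypothetical slack threshold."""
    return max(0.0, kl_diag_gaussians(mu_spec, var_spec, mu_gen, var_gen) - gamma)

def margin_loss(pos_pair, neg_pair, margin=5.0):
    """Push penalties of true hypernym pairs down, and penalties of
    negative pairs above the margin."""
    return order_penalty(*pos_pair) + max(0.0, margin - order_penalty(*neg_pair))

# Toy densities: "entity" is broad (large variance), "animal" is narrower,
# and an unrelated word sits far away in mean space.
d = 4
animal = (np.zeros(d), 0.5 * np.ones(d))
entity = (np.zeros(d), 2.0 * np.ones(d))
unrelated = (3.0 * np.ones(d), 0.5 * np.ones(d))

print(margin_loss(pos_pair=animal + entity, neg_pair=unrelated + entity))
```

The asymmetry of the KL divergence is what encodes direction: the penalty for "animal" being encapsulated by "entity" differs from the reverse, which is exactly the property an order embedding needs.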
Similar resources
Parameter Free Hierarchical Graph-Based Clustering for Analyzing Continuous Word Embeddings
Word embeddings are high-dimensional vector representations of words and are thus difficult to interpret. To deal with this, we introduce an unsupervised, parameter-free method for creating a hierarchical graphical clustering of the full ensemble of word vectors and show that this structure is a geometrically meaningful representation of the original relations between the words. This ne...
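As an illustration of the general idea, the following sketch builds a hierarchy over a few toy word vectors with standard agglomerative clustering in SciPy; this is not the paper's parameter-free graph-based method, and the tiny 2-D "embeddings" are invented for the example.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, dendrogram

words = ["cat", "dog", "car", "truck"]
# Hypothetical 2-D "embeddings": the two animals lie near each other,
# as do the two vehicles.
vecs = np.array([[0.0, 1.0], [0.1, 0.9], [5.0, 5.1], [5.2, 4.9]])

# Average-linkage agglomerative clustering yields a binary merge tree.
tree = linkage(vecs, method="average", metric="euclidean")
# Leaf order of the resulting dendrogram (no plotting needed).
print(dendrogram(tree, labels=words, no_plot=True)["ivl"])
```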
Second-Order Word Embeddings from Nearest Neighbor Topological Features
We introduce second-order vector representations of words, induced from nearest neighborhood topological features in pre-trained contextual word embeddings. We then analyze the effects of using second-order embeddings as input features in two deep natural language processing models, for named entity recognition and recognizing textual entailment, as well as a linear model for paraphrase recogni...
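One simple way to realize such second-order representations, sketched below under assumed details (the paper's exact topological features may differ), is to re-encode each word by membership indicators over its k nearest neighbors in the pre-trained embedding space.

```python
import numpy as np

def second_order(first_order, k=2):
    """Re-represent each word as a binary indicator vector over which
    words appear in its k-nearest-neighbor set (by cosine similarity)."""
    normed = first_order / np.linalg.norm(first_order, axis=1, keepdims=True)
    sims = normed @ normed.T
    np.fill_diagonal(sims, -np.inf)          # exclude self-neighbors
    nn = np.argsort(-sims, axis=1)[:, :k]    # indices of k nearest neighbors
    out = np.zeros_like(sims)
    rows = np.arange(len(first_order))[:, None]
    out[rows, nn] = 1.0                      # one-hot neighborhood membership
    return out

# Toy stand-in for pre-trained embeddings: 6 words, 50 dimensions.
emb = np.random.default_rng(1).normal(size=(6, 50))
print(second_order(emb, k=2))
```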
Robust Low Rank Kernel Embeddings of Multivariate Distributions
Kernel embedding of distributions has led to many recent advances in machine learning. However, latent and low rank structures prevalent in real world distributions have rarely been taken into account in this setting. Furthermore, no prior work in kernel embedding literature has addressed the issue of robust embedding when the latent and low rank information are misspecified. In this paper, we ...
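For context, the basic object here is the empirical kernel mean embedding of a sample. The sketch below compares two samples via the RBF-kernel maximum mean discrepancy (MMD), the distance between their mean embeddings; it illustrates plain kernel embeddings only, not the paper's robust low-rank estimator.

```python
import numpy as np

def rbf(x, y, sigma=1.0):
    """RBF kernel matrix between sample sets x and y."""
    d2 = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma ** 2))

def mmd2(x, y, sigma=1.0):
    """(Biased) squared MMD between the empirical distributions of x and y,
    i.e. the squared distance between their kernel mean embeddings."""
    return (rbf(x, x, sigma).mean() + rbf(y, y, sigma).mean()
            - 2.0 * rbf(x, y, sigma).mean())

rng = np.random.default_rng(2)
# Two Gaussian samples with shifted means should give a positive MMD.
print(mmd2(rng.normal(0.0, 1.0, (100, 3)), rng.normal(0.5, 1.0, (100, 3))))
```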
Radical-Based Hierarchical Embeddings for Chinese Sentiment Analysis at Sentence Level
Text representation in Chinese sentiment analysis usually works at the word or character level. In this paper, we show that radical-level processing can greatly improve sentiment classification performance. In particular, we propose two types of Chinese radical-based hierarchical embeddings. The embeddings incorporate not only semantics at the radical and character level, but also sentiment inf...
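A toy sketch of the hierarchical idea follows; the radical decompositions and the simple averaging scheme are illustrative assumptions, not the paper's model, which learns the embeddings and their composition.

```python
import numpy as np

rng = np.random.default_rng(3)
# Hypothetical radical embeddings (8-dimensional, randomly initialized).
radical_vecs = {r: rng.normal(size=8) for r in ["氵", "木", "口"]}
# Example decompositions: 沐 = 氵 + 木, 呆 = 口 + 木.
char_radicals = {"沐": ["氵", "木"], "呆": ["口", "木"]}

def char_vec(ch):
    """Character vector as the mean of its radical vectors."""
    return np.mean([radical_vecs[r] for r in char_radicals[ch]], axis=0)

def sentence_vec(chars):
    """Sentence vector as the mean of its character vectors."""
    return np.mean([char_vec(c) for c in chars], axis=0)

print(sentence_vec(["沐", "呆"]).shape)  # (8,)
```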
Exploring the Effects of External Semantic Data on Word Embeddings
Distributed word embeddings are a groundbreaking development widely used in many deep natural language processing tasks such as Machine Translation and Question Answering. However, despite their success, currently popular word embedding methods such as GloVe, Skipgram, or CBOW consider only the distributional statistics of words on their own, without any external semantic information...
Publication year: 2018